mruby 4.0.0
mruby is the lightweight implementation of the Ruby language
This document describes the internals of mruby's garbage collector for developers working on src/gc.c and related code.
Read this if you are: modifying core data structures that hold object references (and need to add write barriers), debugging memory leaks or GC-related crashes, tuning GC performance for an embedded target, or working on the GC code itself.
For user-facing GC docs, see gc-arena-howto.md (arena usage in C extensions) and memory.md (heap regions).
mruby uses a tri-color incremental mark-and-sweep garbage collector with an optional generational mode. The collector runs in small incremental steps between VM instruction execution, avoiding long pauses.
Every heap-allocated object has a color stored in RBasic::gc_color (3 bits):
| Color | Value | Meaning |
|---|---|---|
| White (A or B) | 1 or 2 | Unmarked, candidate for collection |
| Gray | 0 | Marked, but children not yet scanned |
| Black | 4 | Fully marked and scanned |
| Red | 7 | Static/ROM object, never collected |
The GC uses two whites (A and B) in a flip-flop scheme. At the start of each GC cycle, the meaning of "current white" is flipped by XORing the white bits. The flip is an O(1) operation; without it, every live object would have to be recolored at each cycle boundary, which is O(n).
An object is dead if it still carries the previous cycle's white color.
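The flip-flop scheme can be illustrated with a small standalone model. This is a sketch, not the code in src/gc.c: the color values come from the table above, while `toy_gc` and the `toy_*` helpers are hypothetical names introduced here for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Color values taken from the table above. */
enum {
  GC_GRAY    = 0,
  GC_WHITE_A = 1,
  GC_WHITE_B = 2,
  GC_BLACK   = 4,
  GC_RED     = 7,
  GC_WHITES  = GC_WHITE_A | GC_WHITE_B
};

/* Hypothetical stand-in for the collector state. */
typedef struct {
  unsigned current_white;  /* GC_WHITE_A or GC_WHITE_B */
} toy_gc;

/* Flip "current white" in O(1) by XORing both white bits. */
static void toy_flip_white(toy_gc *gc) {
  assert(gc->current_white == GC_WHITE_A || gc->current_white == GC_WHITE_B);
  gc->current_white ^= GC_WHITES;
}

/* The white that was current before the most recent flip. */
static unsigned toy_other_white(const toy_gc *gc) {
  return gc->current_white ^ GC_WHITES;
}

/* An object is dead if it still carries the previous cycle's white. */
static bool toy_is_dead(const toy_gc *gc, unsigned color) {
  return color == toy_other_white(gc);
}
```

In this model no object is touched by the flip itself: an object allocated as white A simply becomes "previous white" once the collector switches to white B.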
Objects are allocated from fixed-size heap pages:
Each page holds MRB_HEAP_PAGE_SIZE objects (default 1024). On 64-bit systems, a page is approximately 40 KB (40 bytes per slot).
All mruby object types share the same slot size via a C union:
Free slots use the union space for a freelist pointer.
Each page maintains a singly-linked freelist of available slots. Allocation pops from the freelist; deallocation during sweep prepends to the freelist. The GC tracks pages with free slots in gc->free_heaps for fast allocation.
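The per-page freelist can be sketched as follows. This is a simplified model under stated assumptions: `toy_slot`, `toy_page`, and the `toy_page_*` functions are hypothetical names, the page holds only 4 slots to keep the example small, and the 40-byte payload stands in for the real object union.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_PAGE_SIZE 4  /* the real default is MRB_HEAP_PAGE_SIZE = 1024 */

/* A free slot reuses its own storage for the freelist link. */
typedef union toy_slot {
  union toy_slot *next_free;   /* valid only while the slot is free */
  unsigned char payload[40];   /* stand-in for the object union */
} toy_slot;

typedef struct {
  toy_slot *freelist;
  toy_slot slots[TOY_PAGE_SIZE];
} toy_page;

/* Thread every slot onto the page's freelist, lowest index first. */
static void toy_page_init(toy_page *page) {
  assert(page != NULL);
  page->freelist = NULL;
  for (size_t i = TOY_PAGE_SIZE; i > 0; i--) {
    page->slots[i - 1].next_free = page->freelist;
    page->freelist = &page->slots[i - 1];
  }
}

/* Allocation pops the head; returns NULL when the page is full. */
static toy_slot *toy_page_alloc(toy_page *page) {
  toy_slot *slot = page->freelist;
  if (slot != NULL) page->freelist = slot->next_free;
  return slot;
}

/* Sweep-time release prepends the slot to the freelist. */
static void toy_page_release(toy_page *page, toy_slot *slot) {
  slot->next_free = page->freelist;
  page->freelist = slot;
}
```

Because release prepends, the most recently freed slot is the next one handed out, and both operations are O(1).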
For embedded systems with fixed memory banks, mrb_gc_add_region() carves heap pages from a user-provided contiguous buffer:
Region pages are never freed by the GC (even if all objects die). When region pages are exhausted, allocation falls back to malloc().
The GC operates as a three-state machine:
The root-scan state marks objects directly reachable from the VM:
After root scanning, the white color is flipped.
In the mark state, gray objects are popped from the gray stack and their children are marked. Each step processes a limited number of objects:
With default step_ratio = 200 and GC_STEP_SIZE = 1024, the limit is 2048 objects per step. After each step, gc_debt is decremented by the actual number of objects processed, so larger steps repay more debt.
When the gray stack is exhausted, the final marking phase re-marks the arena and global variables to catch objects created during marking, then transitions to sweep.
The sweep state iterates through heap pages. For each object:
Sweep is also incremental: gc->sweeps tracks the current page position between steps.
The gray stack is a fixed-size array of object pointers:
When the stack overflows, gray_overflow is set to TRUE. During marking, gc_gray_rescan() scans the entire heap to find any gray objects that could not be pushed. This guarantees correctness at the cost of a full heap scan.
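The overflow fallback can be modeled in a few lines. This is a minimal sketch, not the actual gc_gray_rescan(): the `toy_*` names are hypothetical, the stack holds only 2 entries so that overflow is easy to trigger, and objects have no children, so "scanning" an object just paints it black.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_GRAY_STACK_SIZE 2  /* the real default is MRB_GRAY_STACK_SIZE = 1024 */
#define TOY_HEAP_SIZE 4

enum { TOY_GRAY = 0, TOY_WHITE = 1, TOY_BLACK = 4 };

typedef struct { unsigned color; } toy_obj;

typedef struct {
  toy_obj heap[TOY_HEAP_SIZE];
  toy_obj *gray_stack[TOY_GRAY_STACK_SIZE];
  size_t gray_top;
  bool gray_overflow;
} toy_gc;

/* Paint gray; push if there is room, otherwise only record the overflow. */
static void toy_mark_gray(toy_gc *gc, toy_obj *obj) {
  obj->color = TOY_GRAY;
  if (gc->gray_top < TOY_GRAY_STACK_SIZE) {
    gc->gray_stack[gc->gray_top++] = obj;
  } else {
    gc->gray_overflow = true;
  }
}

/* Pop and blacken everything currently on the stack; returns the count. */
static size_t toy_drain(toy_gc *gc) {
  size_t n = 0;
  while (gc->gray_top > 0) {
    gc->gray_stack[--gc->gray_top]->color = TOY_BLACK;
    n++;
  }
  return n;
}

/* Full heap scan: re-push gray objects that did not fit on the stack. */
static void toy_gray_rescan(toy_gc *gc) {
  assert(gc->gray_top == 0);  /* only called once the stack is drained */
  gc->gray_overflow = false;
  for (size_t i = 0; i < TOY_HEAP_SIZE; i++) {
    if (gc->heap[i].color == TOY_GRAY) toy_mark_gray(gc, &gc->heap[i]);
  }
}
```

If the rescan itself overflows the stack, the flag is set again and the drain/rescan pair repeats, so no gray object is lost; the price is one full heap walk per overflow.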
During incremental marking, a black (fully marked) object storing a reference to a white (unmarked) object creates a dangerous edge that could lead to premature collection. Write barriers prevent this.
The field write barrier, mrb_field_write_barrier(), is used when assigning a specific field:
If parent is black and child is white:
The general write barrier, mrb_write_barrier(), is used when an object has been modified but the specific child is not known:
It paints obj gray and pushes it onto the gray stack for re-scanning.
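The two barriers can be contrasted with a standalone model. This is a sketch under assumptions, not mruby's implementation: the `toy_*` names are hypothetical, the field barrier here resolves a black-to-white edge by graying the child (one standard choice, assumed for illustration), and the general barrier is assumed to act only on black objects.

```c
#include <assert.h>
#include <stddef.h>

enum { TOY_GRAY = 0, TOY_WHITE = 1, TOY_BLACK = 4 };

typedef struct toy_obj {
  unsigned color;
  struct toy_obj *field;
} toy_obj;

typedef struct {
  toy_obj *gray_stack[8];
  size_t gray_top;
} toy_gc;

/* Paint gray and push for (re-)scanning. */
static void toy_push_gray(toy_gc *gc, toy_obj *obj) {
  assert(gc->gray_top < 8);  /* overflow handling omitted in this sketch */
  obj->color = TOY_GRAY;
  gc->gray_stack[gc->gray_top++] = obj;
}

/* Field barrier: only a black parent pointing at a white child is dangerous.
   Assumption: the edge is repaired by graying the child. */
static void toy_field_write_barrier(toy_gc *gc, toy_obj *parent, toy_obj *child) {
  if (parent->color == TOY_BLACK && child->color == TOY_WHITE) {
    toy_push_gray(gc, child);
  }
}

/* General barrier: the changed child is unknown, so re-gray the object itself.
   Assumption: objects that are not black need no action. */
static void toy_write_barrier(toy_gc *gc, toy_obj *obj) {
  if (obj->color == TOY_BLACK) toy_push_gray(gc, obj);
}
```

The field barrier is cheaper when the stored child is known, because only that child is queued; the general barrier forces the whole object to be scanned again.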
The arena protects newly created objects from collection before they are stored in a reachable location. Every call to mrb_obj_alloc() automatically pushes the new object onto the arena.
C extensions must save and restore the arena index when creating many temporary objects to prevent arena overflow:
For long-lived C objects that must survive indefinitely, use mrb_gc_register() and mrb_gc_unregister().
These functions keep the registered objects in a global array that is always marked as part of the root set.
See gc-arena-howto.md for detailed usage patterns.
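Why saving and restoring the arena index prevents overflow can be seen in a standalone model. This is a sketch, not the mruby API: `toy_arena` and the `toy_arena_*` functions are hypothetical stand-ins that only mimic the save/restore pattern, and the capacity mirrors the MRB_GC_ARENA_SIZE default of 100.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_ARENA_SIZE 100  /* mirrors the MRB_GC_ARENA_SIZE default */

typedef struct {
  const void *slots[TOY_ARENA_SIZE];
  int idx;  /* number of protected objects */
} toy_arena;

/* What allocation does implicitly: protect the new object. */
static void toy_arena_protect(toy_arena *a, const void *obj) {
  assert(a->idx < TOY_ARENA_SIZE);  /* this is the overflow being avoided */
  a->slots[a->idx++] = obj;
}

static int toy_arena_save(const toy_arena *a) {
  return a->idx;
}

static void toy_arena_restore(toy_arena *a, int idx) {
  assert(idx >= 0 && idx <= a->idx);
  a->idx = idx;  /* everything pushed after the save is unprotected again */
}

/* Create `count` temporaries, restoring the arena after each one.
   Returns the peak arena usage observed. */
static int toy_make_temporaries(toy_arena *a, int count) {
  static const int dummy = 0;  /* stand-in for a freshly allocated object */
  int peak = a->idx;
  for (int i = 0; i < count; i++) {
    int ai = toy_arena_save(a);
    toy_arena_protect(a, &dummy);
    if (a->idx > peak) peak = a->idx;
    toy_arena_restore(a, ai);
  }
  return peak;
}
```

With the save/restore pair inside the loop, 1000 temporaries never occupy more than one arena slot in this model; without it, the 101st allocation would exceed the 100-slot capacity.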
When generational mode is enabled (the default, unless MRB_GC_TURN_OFF_GENERATIONAL is defined), the GC classifies objects into young and old generations.
A minor GC processes only young objects. Pages where all objects are old are marked with page->old = TRUE and skipped entirely during sweep. A minor GC always runs to completion in a single step.
A major GC is a full mark-and-sweep cycle that processes all objects. It is triggered when gc->live > gc->oldgen_threshold, and it runs incrementally, like the non-generational collector.
After a major GC completes, the collector reverts to minor GC mode. The old-generation threshold is recalculated:
With MAJOR_GC_INC_RATIO = 120, a major GC triggers when live objects exceed 120% of the last major GC's survivors.
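The 120% rule can be worked through numerically. This is a sketch under assumptions: the helper names are hypothetical, and the expression `survivors / 100 * ratio` is an assumed form chosen to match the percentage described above, not a quotation of src/gc.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAJOR_GC_INC_RATIO 120  /* percent, as stated above */

/* Assumed form of the recalculation after a major GC. */
static size_t toy_oldgen_threshold(size_t survivors, size_t ratio_percent) {
  assert(ratio_percent >= 100);  /* below 100 would trigger immediately */
  return survivors / 100 * ratio_percent;
}

/* A major GC is due once live objects exceed the threshold. */
static bool toy_needs_major_gc(size_t live, size_t threshold) {
  return live > threshold;
}
```

For example, if 1000 objects survive a major GC, the threshold in this model becomes 1200, so the next major GC is due once the live count reaches 1201.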
From Ruby: GC.generational_mode = true/false.
mrb_obj_alloc() is the core allocation function:
obj_free() performs type-specific cleanup:
The object's type is set to MRB_TT_FREE after freeing.
The GC uses a debt-based feedback model to balance allocation rate against collection work. The key field is gc->gc_debt (a signed integer):
Each object allocation increments gc_debt by 1. When debt goes positive, mrb_incremental_gc() runs. Each incremental step decrements debt by GC_STEP_SIZE (1024), giving credit for many future allocations.
When a GC cycle completes, credit is calculated from interval_ratio:
With default interval_ratio = 200 and 1000 live objects: credit = (1000/100)*200 - 1000 = 1000, so approximately 1000 allocations can occur before the next GC cycle begins.
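The credit arithmetic above can be checked with a small standalone function. The credit expression is the one given in this document; everything else is an assumption made for illustration: the `toy_*` names are hypothetical, and the model assumes the credit is applied as negative debt and that collection starts once debt becomes positive.

```c
#include <assert.h>

/* Credit formula as stated in this document (integer arithmetic). */
static long toy_cycle_credit(long live_after_mark, long interval_ratio) {
  assert(live_after_mark >= 0 && interval_ratio >= 0);
  return (live_after_mark / 100) * interval_ratio - live_after_mark;
}

/* Count allocations until debt goes positive.
   Assumption: the credit is applied as negative debt. */
static long toy_allocs_until_gc(long credit) {
  long debt = -credit;
  long allocs = 0;
  while (debt <= 0) {
    debt++;    /* each allocation adds 1 to the debt */
    allocs++;
  }
  return allocs;
}
```

With 1000 live objects, an interval_ratio of 200 gives a credit of 1000 and an interval_ratio of 400 gives 3000, which is why raising the ratio spaces GC cycles further apart.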
When gc->malloc_threshold is set (non-zero), the GC also tracks bytes allocated through mrb_realloc_simple() in gc->malloc_increase. When malloc_increase exceeds malloc_threshold, the counter resets and an incremental GC step runs. This captures memory pressure from large buffers (e.g., long strings) that would otherwise be invisible to the object-count-based debt model.
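The byte-count trigger can be modeled as follows. This is a sketch, not the code path inside mrb_realloc_simple(): `toy_gc`, `toy_note_malloc`, and the `steps_run` counter are hypothetical, and the counter stands in for actually running an incremental GC step.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
  size_t malloc_threshold;  /* 0 disables byte tracking */
  size_t malloc_increase;   /* bytes allocated since the last reset */
  unsigned steps_run;       /* stand-in for incremental GC steps triggered */
} toy_gc;

/* Record an allocation of `bytes` and trigger a step when over threshold. */
static void toy_note_malloc(toy_gc *gc, size_t bytes) {
  assert(gc != NULL);
  if (gc->malloc_threshold == 0) return;  /* tracking disabled */
  gc->malloc_increase += bytes;
  if (gc->malloc_increase > gc->malloc_threshold) {
    gc->malloc_increase = 0;  /* the counter resets */
    gc->steps_run++;          /* an incremental GC step would run here */
  }
}
```

A single 1 MB string buffer adds only one object to the debt count, but in this model it crosses a 64 KB threshold immediately and prompts collection work.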
From Ruby: GC.start.
| Macro | Default | Description |
|---|---|---|
| MRB_HEAP_PAGE_SIZE | 1024 | Objects per heap page |
| MRB_GRAY_STACK_SIZE | 1024 | Gray stack capacity |
| MRB_GC_ARENA_SIZE | 100 | Arena size (fixed mode) or initial size |
| MRB_GC_FIXED_ARENA | off | Use fixed-size arena |
| MRB_GC_TURN_OFF_GENERATIONAL | off | Disable generational mode |
| MRB_GC_STRESS | off | Full GC on every allocation (debug) |
| MRB_GC_STATS | off | Enable GC statistics counters |
| MRB_USE_MALLOC_TRIM | off | Call malloc_trim() after full GC |
From Ruby code:
GC.stat returns a Hash with GC state and statistics:
With MRB_GC_STATS enabled, additional keys are available:
interval_ratio (default 200): Controls how many allocations occur between GC cycles. Higher values reduce GC frequency but increase peak memory. The debt credit after each cycle is (live_after_mark / 100) * interval_ratio - live_after_mark.
step_ratio (default 200): Controls how much work each incremental step performs. Higher values make each step larger, reducing total GC overhead but increasing individual pause times.
step_limit (default 0, unlimited): Caps the maximum work per incremental step regardless of step_ratio. Useful for real-time applications that need bounded pause times. When step_limit is non-zero, the effective step size is the smaller of the limit derived from step_ratio and step_limit.
malloc_threshold (default 0, disabled): Triggers GC when cumulative malloc/realloc bytes exceed this threshold. Useful when applications allocate large buffers (strings, data objects) that create memory pressure without proportional object count increase.
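How step_ratio and step_limit combine can be sketched as a single function. This is an illustration under assumptions: `toy_step_limit` is a hypothetical name, and the expression `GC_STEP_SIZE * step_ratio / 100` is an assumed form, chosen because it reproduces the 2048-objects-per-step figure quoted earlier for the defaults.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_GC_STEP_SIZE 1024

/* Effective work limit for one incremental step.
   step_limit == 0 means "no cap". */
static size_t toy_step_limit(size_t step_ratio, size_t step_limit) {
  assert(step_ratio > 0);
  /* Assumed form; yields 2048 for step_ratio = 200. */
  size_t limit = TOY_GC_STEP_SIZE * step_ratio / 100;
  if (step_limit != 0 && step_limit < limit) limit = step_limit;
  return limit;
}
```

In this model a step_limit of 512 shrinks each step to a quarter of the default size, so roughly four times as many steps are needed to finish the same cycle.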
Allocation-heavy workloads (many short-lived Procs, closures, blocks): GC sweep dominates because of high object churn. Increase interval_ratio to reduce GC frequency:
Higher values (400-600) reduce sweep overhead at the cost of more dead objects accumulating before collection. Values above 600 show diminishing returns. Peak memory usage increases temporarily, but live object count after GC remains the same.
CPU-intensive workloads (numeric computation, recursive methods with no object allocation): GC parameters have negligible impact because GC rarely runs. No tuning needed.
Real-time or latency-sensitive applications: Use step_limit to bound pause times:
This makes GC pauses more predictable but increases total GC overhead (more steps needed per cycle).
Large buffer workloads (reading files, building long strings): Set malloc_threshold to trigger GC when buffer allocations accumulate, even if object count is low:
Use GC.stat to monitor GC behavior at runtime:
If debt is frequently positive during performance-critical sections, increase interval_ratio. If memory usage is too high, decrease it.
| File | Contents |
|---|---|
| src/gc.c | GC implementation |
| include/mruby/gc.h | mrb_gc structure, public GC API |
| include/mruby.h | Arena save/restore macros |